Performance Analytics

Diagnostics and Data Quality Analytics for Air Quality Monitoring

Steve Crawshaw

2024-06-11

Contents

  • Business Context and Problem
  • Data Quality Requirements
  • Data Sources
  • Data Transformation and Analysis
  • Reproducible Pipelines
  • Reporting and Impact
  • Recommendations

Summary

  • A Reproducible Analytical Pipeline (RAP) that collates, summarises and reports the key parameters of an air quality monitoring network, providing timely, evidence-based data quality assurance and predictive risk mitigation.

  • The audience is a small team of air quality specialists in a council supporting a network of air quality monitors.

Business Context: Air Quality Management

Business Problem

  • Clean Air Zone (£44m)
    • Highly politically sensitive
    • Relies on air quality data
    • Prescriptive QA regime
    • But human and system error creates data quality risk
  • Solution
    • Reproducible pipeline for data quality analysis

Data Quality Requirements

  • Data collection better than 85% for hourly data
  • 4G data use below 3 GB per site
  • Identification of analyser system faults
  • Instrument calibration
  • Data scaled to span gas
    • Identify gas issues as soon as possible
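
The 85% hourly data requirement can be checked with a simple count of observed against expected hours. The following is a minimal sketch in Python rather than the R used in the pipeline itself. The function name `data_capture_pct` and the sample timestamps are hypothetical, and a real check would also need to exclude readings flagged as invalid.

```python
from datetime import datetime, timedelta


def data_capture_pct(timestamps, start, end):
    """Percentage of hourly slots in [start, end) that have a reading."""
    expected = int((end - start) / timedelta(hours=1))
    if expected <= 0:
        raise ValueError("end must be at least one hour after start")
    # Truncate to the hour and use a set so a duplicated hour counts once.
    observed = {
        t.replace(minute=0, second=0, microsecond=0)
        for t in timestamps
        if start <= t < end
    }
    return 100.0 * len(observed) / expected


# Illustrative data only: one day with four missing hours.
start = datetime(2024, 5, 1)
end = datetime(2024, 5, 2)
readings = [start + timedelta(hours=h) for h in range(24) if h not in (3, 4, 5, 6)]

capture = data_capture_pct(readings, start, end)
meets_target = capture > 85.0  # requirement from the list above
```

With 20 of 24 hours present the capture rate falls just short of the target, which is the kind of result the monthly report would need to surface.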

Data Sources

  • Teledyne API T200 NOx analysers
  • EnviDAS FW data loggers (SQL Server)
  • Teltonika 4G routers
    • RMS platform
  • Azure SQL Server (all sites)
  • Google Sheets (calibration data)

Data Analysis - R packages

  • DBI (database access)
  • config (secure parameters)
  • tidyverse (wrangling)
  • openair (air quality)
  • gt (tables)
  • targets (pipeline)
  • Quarto (reporting)
  • googlesheets4
  • rlist (list processing)
  • httr2 (REST API)

Data Analysis - Diagnostics

  • Air quality data
    • Missingness
  • Instrument diagnostics
    • Operating parameters
      • Temperature
      • Flowrates
      • Pressure
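
One way to turn operating parameters into fault indicators is to compare each reading with an acceptable range. The sketch below is in Python for illustration only. The parameter names and limits are assumptions invented for the example, not values from the analyser documentation.

```python
# Hypothetical limits: real values would come from the analyser manual.
LIMITS = {
    "box_temp_c": (5.0, 40.0),
    "sample_flow_ccm": (450.0, 550.0),
    "sample_pressure_inhg": (25.0, 30.0),
}


def out_of_range(readings, limits=LIMITS):
    """Return {parameter: value} for every reading outside its limits."""
    flagged = {}
    for name, value in readings.items():
        low, high = limits[name]
        if not (low <= value <= high):
            flagged[name] = value
    return flagged


# Illustrative readings: the sample flow is below its assumed lower limit.
sample = {
    "box_temp_c": 31.2,
    "sample_flow_ccm": 410.0,
    "sample_pressure_inhg": 28.9,
}
faults = out_of_range(sample)
```

Returning only the offending parameters keeps the output short enough to drop straight into a table for the engineers.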

Data Analysis - Calibration

  • Zero and Span
    • Red lines indicate optimal values
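
Zero and span responses are what allow raw readings to be scaled to the span gas. The sketch below shows a generic two-point linear correction in Python. It is an assumption about the general approach, not the author's implementation, and all numbers are invented for illustration.

```python
def calibration_factor(zero_response, span_response, span_gas_conc):
    """Factor that maps the zero-corrected span response onto the cylinder value."""
    if span_response == zero_response:
        raise ValueError("span and zero responses must differ")
    return span_gas_conc / (span_response - zero_response)


def scale(raw, zero_response, factor):
    """Subtract the zero offset, then apply the span factor."""
    return (raw - zero_response) * factor


# Illustrative values in ppb: the analyser reads 452 on a 460 ppb cylinder.
factor = calibration_factor(zero_response=2.0, span_response=452.0, span_gas_conc=460.0)
scaled = scale(raw=227.0, zero_response=2.0, factor=factor)
```

A factor drifting away from 1, or a zero drifting away from 0, is the kind of movement the reference lines on the calibration plot make visible.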

Data Analysis - Calibration

  • Span divergence
    • NO and NOx
    • Gas contamination
    • NO is oxidised to NO2
    • Affects span and accuracy
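
Because oxidation converts NO to NO2 while total NOx stays roughly constant, a widening gap between the NO and NOx span responses is a plausible early warning of a contaminated cylinder. The Python sketch below illustrates the idea. The 5% threshold and the calibration records are invented for the example and are not taken from the source.

```python
def no_nox_divergence_pct(no_span, nox_span):
    """Percentage by which the NO span response falls below the NOx response."""
    if nox_span <= 0:
        raise ValueError("NOx span response must be positive")
    return 100.0 * (nox_span - no_span) / nox_span


# Illustrative records: (calibration date, NO span, NOx span) in ppb.
calibrations = [
    ("2024-03-01", 455.0, 458.0),
    ("2024-04-01", 449.0, 458.0),
    ("2024-05-01", 431.0, 457.0),
]

THRESHOLD_PCT = 5.0  # assumed alert level, for illustration only

flagged = [
    date
    for date, no_span, nox_span in calibrations
    if no_nox_divergence_pct(no_span, nox_span) > THRESHOLD_PCT
]
```

Tracking the trend across successive calibrations, rather than a single visit, is what makes it possible to respond before the span gas degrades the data.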

Data Analysis - Telemetry Data

Reproducible Analytical Pipelines

  • targets package
  • RAP automates function execution
  • Skips redundant steps
  • Encourages functional style
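
The idea of skipping redundant steps can be illustrated with a toy pipeline that reruns a step only when its inputs change. This Python sketch is not how the targets package is implemented. The `MiniPipeline` class is hypothetical, and it fingerprints only the function name and inputs, whereas targets also tracks changes to the code itself.

```python
import hashlib
import json


class MiniPipeline:
    """Toy pipeline: each named target is rebuilt only if its fingerprint changes."""

    def __init__(self):
        self.store = {}  # target name -> (fingerprint, value)
        self.runs = []   # names of targets that were actually executed

    def target(self, name, func, *inputs):
        payload = json.dumps([func.__name__, inputs], sort_keys=True, default=str)
        fingerprint = hashlib.sha256(payload.encode()).hexdigest()
        cached = self.store.get(name)
        if cached is not None and cached[0] == fingerprint:
            return cached[1]  # up to date: skip the work
        value = func(*inputs)
        self.store[name] = (fingerprint, value)
        self.runs.append(name)
        return value


def clean(rows):
    """Drop missing readings."""
    return [r for r in rows if r is not None]


def mean(rows):
    return sum(rows) / len(rows)


pipe = MiniPipeline()
raw = [10.0, None, 14.0]
for _ in range(2):  # the second pass finds both targets up to date
    cleaned = pipe.target("cleaned", clean, raw)
    average = pipe.target("average", mean, cleaned)
```

Each step is a pure function of its inputs, which is why this style of pipeline encourages functional code: a step with hidden state could not be skipped safely.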

Reporting

  • Monthly report
  • Select start and end dates
  • Rendered to HTML by Quarto
  • Quarto reads the objects in the targets store
  • Some minimal formatting with gt package
  • HTML output can be shared with engineers

Impact

  • Data quality risks identified and mitigated
  • Diagnostics report to engineers produced automatically
  • Predictive response to gas contamination issues

Business Recommendations

  • Extend analytics with dashboard
  • Investigate:
    • BAM diagnostics (PM10, PM2.5)
    • Data logger diagnostics
  • Consolidate with data ratification
  • Add public reporting
  • Include alerts for key parameters
  • Write R package & containerise

Questions?